Abstract.

The notion of 'presentation', as used in combinatorial group theory, is
applied to coded character sets (CCSs), the sets which facilitate the
interchange of messages in a digital computer network (DCN). By dividing
each element of the set into two portions and using the idea of group
presentation (whereby a group is specified by its set of generators and its
set of relators), the presentation of a CCS is described. This is illustrated
using the Extended Binary Coded Decimal Interchange Code (EBCDIC), which is
one of the most popular CCSs in DCNs.

Key words. Group presentation, coded character set, digital computer network
1. Introduction.

Most of the data presented to a digital computer system (DCS) are in the
form of records, and these records are usually entered as alphanumeric
characters. For every number manipulated by a DCS, ten alphabetic characters
are processed [13]. Mathematically, a coded character set (CCS) arises from a
mapping between the set of binary digits (bits) and the set of characters;
the emphasis in this work is on the set of bits (codes). Thus, the set of
bits of a CCS, and in effect a CCS, is structurally a set of sequences. The
ordering in the set is according to the collating sequence (i.e., the natural
sequence of appearance in the set). Coded character sets are important in a
DCS because they provide a means of representing alphanumeric characters
(i.e., numerals, letters, punctuation characters and control codes) as fixed
sequences of zeros and ones. The sets are normally utilized in the sixth
layer (i.e., the presentation layer) of the seven open systems
interconnection (OSI) layers of a computer network [34]. Two popular
examples of such sets are the 7-bit American Standard Code for Information
Interchange (ASCII), consisting of 128 characters, and the Extended Binary
Coded Decimal Interchange Code (EBCDIC), consisting of 256 characters.

Although a CCS is a code (i.e., it is a code mapped onto a set of
characters), it is not an error detection or correction (EDC) code and it is
not always a group code. A well-known result is that a code is linear iff it
is a group [4]. A lot of the literature has been devoted to studies of the
mathematical properties of linear and nonlinear codes and to the
construction of EDC codes; see, e.g., [1, 10, 11, 19]. In particular, some
properties of group codes are discussed in [21]. Also, traditional studies of
the properties of CCSs have focused on describing the characteristics of the
codes in terms of shiftedness, BCD for numerics, BCD for alphabetics,
numerics in numeric sequence, signed numerics and the matching of the
collating sequence with the bit sequence [25]. For instance, the EBCDIC is
not shifted and its alphabetics are not in contiguous sequence. However, it
possesses all the other properties mentioned above.

In
some of his earlier papers [e.g., 29, 31], the author applied the idea of
equivalence relation to ordinary differential equations of the form
$x' = g(x)$, where $g$ is a polynomial in $x$ of degree $n$ with real
coefficients $a_i$, and constructed codes using the quaternary system
{Attractor (A), Repellor (R), PositiveShunt (P), NegativeShunt (N)}, where
A, R, P and N are the possible phase portraits for (a linear) equation on
the line. The blocklength of a code constructed this way is equal to the
degree n of the polynomial. In [6], a geometric technique for constructing
codes using the black/white lift of a cap was presented. This is
accomplished by partitioning a set of points, no three of which are
collinear. In the
present paper, however, a new approach to the study of CCSs (and binary
uniform digital codes in general) is discussed and pertinent theorems on the
approach are presented. The approach, called 'code presentation', uses the
concept of a partition and imitates the idea behind 'group presentation'.
This enables a CCS to be described in a simple form in terms of a 'zoned
set' and a 'decinumer set', just as a group is described in terms of a 'set
of generators' and a 'set of relators' [7, 12, 14, 18, 22, 24, 26, 27, 33].
In particular, it is shown in [27] that a generalized free product of two
finitely presented groups acting on trees (a non-primitive computer data
structure) is finitely presented iff the amalgamated subgroup is finitely
generated. The general applicability of the technique of code presentation
to codes was discussed in [30].

As is well known, the idea behind (group) presentation has itself been
applied to a number of areas in mathematics and computer science, including
geometry, topology, C*-algebras, knot theory, automorphic functions [26],
computable algebra [15] and other areas. Presentation in its ordinary
meaning refers to the depicting or writing of an algebraic structure in a
simple form. In particular, in the Theory of Computation, the presentation
of a function refers to a definition which gives an effective method for
computing the function [5]. In Formal Language Theory, the concept of
'generation' (or derivation) is associated with the phenomenon in which a
language may be generated by a phrase structure grammar [16]. Although the
term 'Source Code Presentation' exists in Software Engineering, it is used
in a different setting to describe the readability of the source code of a
program [35]. The beauty of group presentation, apart from its simplicity,
is that it assists in deriving information about an algebraic structure from
its presentation.

2. Group Presentation. 

Code presentation may be viewed indirectly as another area of application
of group theory. Generally, group presentation per se is unsuitable for
presenting CCSs since not all binary uniform digital codes (C, +) are
groups, where C is a typical code and '+' is the binary operation defined on
it. That is, the four basic axioms of a group (namely closure, existence of
an identity, existence of an inverse and associativity) are not always
satisfied in a code. For instance, the simple code {00000, 01100, 00110,
11000, 11001, 11011} of order six and blocklength five is not a group
because it is not closed: under bitwise addition modulo 2,
01100 + 00110 = 01010, which is not a word of the code. The methodology of
code presentation used in this paper is based on the premise that a typical
modern-day digital computer has an architecture in which the bit patterns of
a memory location in the main memory are addressed in bytes, i.e., in
multiples of bits. The following are well-known results on group
presentation [8, 26]:

Theorem 2.1. 

(i) Every group has at least one set of generators, namely itself.

(ii) Every group necessarily has a presentation. In particular, every finite
group has a finite presentation.

(iii) A subgroup of a finitely presented group need not be finitely
presented.

(iv) A group can have many presentations.

(v) Given an arbitrary set of symbols and an arbitrarily prescribed set of
words in these symbols, there exists a unique group, up to isomorphism,
having the symbols as generators and the set of prescribed words as defining
relators. (This result always allows new groups to be constructed.)
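The non-closure of the six-word code quoted before Theorem 2.1 can be checked
mechanically. The following is a minimal sketch, assuming that '+' is bitwise
addition modulo 2 (as in Remark 3.4); the function names are illustrative and
do not come from the paper.

```python
from itertools import product


def xor_words(u: str, v: str) -> str:
    """Bitwise sum modulo 2 of two equal-length binary words."""
    return "".join("1" if a != b else "0" for a, b in zip(u, v))


def closure_failures(code):
    """Return every triple (u, v, u + v) whose sum lies outside the code."""
    words = set(code)
    return [
        (u, v, xor_words(u, v))
        for u, v in product(code, repeat=2)
        if xor_words(u, v) not in words
    ]


# The code of order six and blocklength five from Section 2.
CODE = ["00000", "01100", "00110", "11000", "11001", "11011"]
failures = closure_failures(CODE)

# A non-empty list means the closure axiom fails, so (CODE, +) is not a group.
print(bool(failures))
print(("01100", "00110", "01010") in failures)
```

A single failing pair is enough to rule out the group structure; the sketch
lists all of them only to make the check exhaustive.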

Two examples of group presentations illustrating Theorem 2.1 are:

(i) Symmetric group of degree n ($S_n$) [22]

$S_n \cong \{\sigma_1, \ldots, \sigma_{n-1};\ I_n, B_n\}$

where $I_n$ is the set of relations $\sigma_i^2 = 1$ for
$i \in \{1, 2, \ldots, n-1\}$ and $B_n$ is the set of relations
$\sigma_i\sigma_{i+1}\sigma_i = \sigma_{i+1}\sigma_i\sigma_{i+1}$ for
$i \in \{1, 2, \ldots, n-2\}$ and $\sigma_i\sigma_j = \sigma_j\sigma_i$ for
$i, j \in \{1, 2, \ldots, n-1\}$ with $|i - j| \geq 2$. Under the above
isomorphism, $\sigma_i$ is taken to be the transposition $(i, i+1)$. In
particular, the presentation of the symmetric group on three letters ($S_3$)
is [3]

$S_3 \cong \{a, b;\ a^2, b^3, a^{-1}bab^{-2}\}$

(ii) The Mathieu group $M_{11}$ [12]

(a)

$\{a, b, c, d, e;\ aa, bb, cc, dd, ee, bdbd, bebe, (ab)^3, (de)^3, (be)^5, acece, a(cd)^3\}$

(b)

$\{a, b, c, d;\ aa, bb, cc, dd, (ab)^3, (bc)^3, (cd)^3, (abdbd)^2, (cbdbd)^3\}$

(c)

$\{a, b, c, d;\ aa, bb, cc, dd, acac, adad, (ab)^3, (bc)^3, (cd)^3, (bd)^5, (abd)^5, (bcdbcdc)^3\}$

3. Code Presentation Theorems. 

Let $w = w_1w_2$ be a juxtaposed word of a uniform digital code $C$ of order
$k$ and blocklength $n$ such that $w_1 = a_1a_2 \ldots a_s$ and
$w_2 = a_{s+1}a_{s+2} \ldots a_n$. Then $w_1$ and $w_2$ are respectively
called the zoned portion and the numeric portion of $w$. If $n$ is even we
let $s = n/2$, and if $n$ is odd we let $s = (n+1)/2$ or $s = (n-1)/2$. For
odd $n$, the case $s = (n+1)/2$ gives the Type I definition of the zoned
portion, while the case $s = (n-1)/2$ gives the Type II definition. A
constant zoned portion refers to a zoned portion which is the same for two
or more words. Let the ordering on the code be according to the collating
sequence. Then the ordered set of all the constant zoned portions of $C$ is
called the zoned code (or zoned set). The numer code (or numer set) of $C$
is the ordered set of all the numeric portions of the code. Now suppose $E$
is a subset of $C$. Then $E$ is said to be an equizone of $C$ if all words
in $E$ have a single constant zoned portion. The degree of $C$ refers to the
number of equizones in it. A decinumer of $E$ refers to the decimal value of
a numeric portion of $E$, while the ordered set of all the decinumers of an
equizone is called a decinumer set [30].
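The definitions above can be made concrete with a short sketch. It assumes
that words are given as strings of '0' and '1' already listed in collating
order, and, as a simplification, it groups every zoned portion (including one
that occurs in only a single word). The function names and the sample 4-bit
code are hypothetical and serve only as illustration.

```python
def split_word(word: str, zone_type: str = "I"):
    """Split a binary word into (zoned portion, numeric portion).

    For odd blocklength n, Type I uses s = (n + 1) / 2 and
    Type II uses s = (n - 1) / 2.
    """
    n = len(word)
    if n % 2 == 0:
        s = n // 2
    elif zone_type == "I":
        s = (n + 1) // 2
    else:
        s = (n - 1) // 2
    return word[:s], word[s:]


def equizones(code, zone_type: str = "I"):
    """Map each zoned portion to the numeric portions of its equizone."""
    zones = {}
    for word in code:
        z, x = split_word(word, zone_type)
        zones.setdefault(z, []).append(x)
    return zones


def decinumer_sets(code, zone_type: str = "I"):
    """Map each zoned portion to the decimal values of its numeric portions."""
    return {
        z: [int(x, 2) for x in xs]
        for z, xs in equizones(code, zone_type).items()
    }


# Hypothetical 4-bit code used only to exercise the definitions.
SAMPLE = ["0000", "0001", "0010", "0100", "0101", "1111"]
zoned_set = list(equizones(SAMPLE))   # ordered zoned portions
degree = len(zoned_set)               # number of equizones
print(zoned_set, degree)
print(decinumer_sets(SAMPLE))
```

Here the zoned set is read off as the keys of the grouping, and the degree is
simply the number of keys.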

Theorem 3.1. 

Every coded character set C can be well-ordered. 

Proof. 

A CCS is a set. By the well-ordering principle [20], every set can be
well-ordered. For $c_1, c_2 \in C$, define $c_1 \leq c_2$ iff the decimal
value of $c_1$ is less than or equal to the decimal value of $c_2$ based on
the collating sequence of $C$. It is easily seen that $C$ is a chain and
every non-empty subset of it contains a first element. Therefore, a CCS can
be well-ordered.

Definition 3.2 (Fundamental Definition of Code Presentation).

Let a uniform digital code $C$ have degree $d$ and suppose $E_i$ is a
typical equizone with decinumer set $Q_i$, where $i \leq d$. If $z_i$ is the
constant zoned portion of $E_i$ and $x_{ig}$ the bit pattern of
$g \in Q_i$, then the code presentation of $C$ is given by:

$C\{P\} = \bigcup_{i=1}^{d} \{ z_i x_{ig} \ \forall g \in Q_i \}$    (3.2)
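Equation (3.2) can be read as a round trip: a code is summarised by its pairs
(zoned portion, decinumer set), and the union of the juxtaposed words rebuilds
the code. The sketch below illustrates this under the assumption of an even
blocklength; `present`, `expand` and the sample code are hypothetical names
introduced here, not notation from the paper.

```python
def present(code):
    """Return the pairs (z_i, Q_i) of a code with even blocklength.

    The words are assumed to be binary strings in collating order.
    """
    s = len(code[0]) // 2
    table = {}
    for word in code:
        table.setdefault(word[:s], []).append(int(word[s:], 2))
    return list(table.items())


def expand(presentation, numeric_width: int):
    """Rebuild the words z_i x_ig of equation (3.2) from a presentation."""
    return [
        z + format(g, f"0{numeric_width}b")
        for z, decinumers in presentation
        for g in decinumers
    ]


# Hypothetical 4-bit code used only for illustration.
SAMPLE = ["0000", "0001", "0010", "0100", "0101", "1111"]
presentation = present(SAMPLE)
print(presentation)
print(set(expand(presentation, 2)) == set(SAMPLE))
```

The comparison uses sets because (3.2) is a union; the order of the rebuilt
words matches the original only when the code is already grouped by zone.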

Definition 3.3.

Let $w_1 = a_{11}a_{12} \ldots a_{1n}$ and $w_2 = a_{21}a_{22} \ldots a_{2n}$
be two words of a coded character set and '$*$' a binary operation. Then
$w_1 * w_2$ is defined as

$w_1 * w_2 = (a_{11} * a_{21})(a_{12} * a_{22}) \ldots (a_{1n} * a_{2n})$    (3.3a)

In particular, the word difference ($-$) between $w_1$ and $w_2$, denoted by
$w_1 - w_2$, is given by

$w_1 - w_2 = (a_{11} - a_{21})(a_{12} - a_{22}) \ldots (a_{1n} - a_{2n})$    (3.3b)

where

$a_{ij} - a_{kj} = \begin{cases} 0 & \text{if } a_{ij} = a_{kj} \\ 1 & \text{otherwise} \end{cases}$

Remark 3.4.

Addition in computer arithmetic is normally defined by: 0 + 0 = 0;
0 + 1 = 1; 1 + 0 = 1 and 1 + 1 = 0, while multiplication is normally defined
by: 0.0 = 0; 0.1 = 0; 1.0 = 0 and 1.1 = 1 [9].

Proposition 3.5.

$w_1 - w_2 = w_1 + w_2$

Proof.

Obvious; it follows from the fact that
$a_{ij} - a_{kj} = a_{kj} - a_{ij} = a_{ij} + a_{kj}$.
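Proposition 3.5 can also be confirmed by exhaustive enumeration for short
words. The sketch below is only an illustration: it implements the bit
difference of Definition 3.3 and the addition of Remark 3.4 separately and
compares them on all pairs of 3-bit words. The function names are
assumptions made for this example.

```python
from itertools import product


def bit_difference(a: int, b: int) -> int:
    """Definition 3.3: 0 if the two bits agree, 1 otherwise."""
    return 0 if a == b else 1


def bit_sum(a: int, b: int) -> int:
    """Remark 3.4: addition with 1 + 1 = 0, i.e. addition modulo 2."""
    return (a + b) % 2


def word_op(w1: str, w2: str, bit_op) -> str:
    """Apply a bit operation position by position, as in (3.3a)."""
    return "".join(str(bit_op(int(a), int(b))) for a, b in zip(w1, w2))


words = ["".join(bits) for bits in product("01", repeat=3)]
agree = all(
    word_op(u, v, bit_difference) == word_op(u, v, bit_sum)
    for u in words
    for v in words
)
print(agree)
print(word_op("1100", "1010", bit_difference))
```

Both operations coincide with the bitwise exclusive-or, which is why the word
difference and the word sum are interchangeable in what follows.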

Theorem 3.6.

Let $C$ be a uniform digital code and $T$ the zoned code of $C$. Given
$a, b \in C$ and $z_a, z_b \in T$, let $a \sim b$ iff $z_a - z_b = 0$, where
$z_a, z_b$ are respectively the zoned portions of $a$ and $b$, '$-$' is the
word difference and $0$ denotes the all-zero word. Then $\sim$ defines an
equivalence relation.



Proof.

This follows from the fact that $\sim$ is reflexive, symmetric and
transitive. The distinct equivalence classes of $\sim$ are the equizones. By
virtue of the Fundamental Theorem of Equivalence Relations, it follows that
the set of all these equivalence classes gives a partition of the code $C$.
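The three properties used in the proof, and the resulting partition, can be
verified by brute force on a small code. This is a sketch under stated
assumptions: the sample 4-bit code and the names `related` and
`equivalence_classes` are hypothetical, and the zoned portion is taken to be
the first `s` bits of each word.

```python
from itertools import product


def related(a: str, b: str, s: int) -> bool:
    """a ~ b iff the word difference of the zoned portions is all zeros."""
    diff = "".join("0" if x == y else "1" for x, y in zip(a[:s], b[:s]))
    return set(diff) <= {"0"}


def equivalence_classes(code, s: int):
    """Collect the words of the code into classes of the relation ~."""
    classes = []
    for word in code:
        for cls in classes:
            if related(word, cls[0], s):
                cls.append(word)
                break
        else:
            classes.append([word])
    return classes


SAMPLE = ["0000", "0001", "0010", "0100", "0101", "1111"]
S = 2  # length of the zoned portion for blocklength 4

reflexive = all(related(a, a, S) for a in SAMPLE)
symmetric = all(
    related(a, b, S) == related(b, a, S)
    for a, b in product(SAMPLE, repeat=2)
)
transitive = all(
    related(a, c, S)
    for a, b, c in product(SAMPLE, repeat=3)
    if related(a, b, S) and related(b, c, S)
)
classes = equivalence_classes(SAMPLE, S)
print(reflexive, symmetric, transitive)
print(classes)
```

The classes are pairwise disjoint and together contain every word, which is
the partition into equizones stated in the proof.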

Definition 3.7.

Let $C = \{X_1, X_2, \ldots, X_e\}$ be an ordered code of order $e$ where
$X_i = a_{i1}a_{i2} \ldots a_{in}$, $a_{ij} \in \{0, 1\}$,
$j = 1, 2, \ldots, n$, is a word of blocklength $n$. Then the inverse of
$C$, denoted by $C^{-1}$, is an ordered code given by
$C^{-1} = \{X_e, X_{e-1}, \ldots, X_1\}$. The inverse of $X_i$, denoted by
$X_i^{-1}$, is given by $X_i^{-1} =$ [...]


Theorem 3.8. 

(i) Every coded character set (CCS) necessarily has a presentation. 

(ii) The order of the zoned set of a coded character set is finite. 

(iii) Every subset of a coded character set has a finite presentation. 

(iv) The presentation of a coded character set is not necessarily unique. 

(v) A coded character set can have at most two presentations. 



Proof.

(i) Since every CCS can be written in terms of a zoned portion and a numeric
portion, the result follows.

(ii) A CCS is finite. Therefore it has a finite number of zoned portions.

(iii) This follows from the fact that every subset of a CCS is finite.

(iv) When a CCS has an odd blocklength, its zoned portion, by definition,
can be described in two ways. It then follows that the CCS has two distinct
presentations.

(v) The two distinct presentations in the proof of (iv) are the only
possible presentations. They are the maximum possible presentations.
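Parts (iv) and (v) can be illustrated with the odd-blocklength code of
Section 2. The sketch below computes its Type I and Type II presentations; it
assumes that the collating order is the order by binary value, and the
function name `presentation` is introduced only for this example.

```python
def presentation(code, zone_type: str):
    """Return {zoned portion: decinumer list} for an odd-blocklength code.

    Type I takes s = (n + 1) / 2 and Type II takes s = (n - 1) / 2.
    """
    n = len(code[0])
    s = (n + 1) // 2 if zone_type == "I" else (n - 1) // 2
    table = {}
    for word in code:
        table.setdefault(word[:s], []).append(int(word[s:], 2))
    return table


# The blocklength-five code from Section 2, sorted by binary value
# (an assumption about its collating sequence).
CODE = sorted(["00000", "01100", "00110", "11000", "11001", "11011"])

type_one = presentation(CODE, "I")
type_two = presentation(CODE, "II")
print(type_one)
print(type_two)
print(type_one != type_two)
```

For this code the Type I presentation has four equizones and the Type II
presentation has three, so the two presentations are indeed distinct.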

4. Example. 

The EBCDIC is one of the two most popular CCSs in digital computer systems
[2]. It has 16 equizones, namely $E_0, E_1, \ldots, E_9, E_A, E_B, \ldots,
E_F$. The zoned set of the code is the 4-bit hexad set
$H_{16}^{4}$ = {0000, 0001, 0010, 0011, ..., 1110, 1111}. Table 1 gives the
decinumer set of each equizone of the EBCDIC code, where '-' is the set
difference. In the table, U is the decinumer set which corresponds to
equizone $E_i$, while $T_4$ represents the order-preserving decinumer set of
$H_{16}^{4}$, i.e., $T_4 = \{0, 1, 2, \ldots, 14, 15\}$. Both equizones
$E_1$ and $E_2$ have the highest number of characters (i.e., 13 characters
each), while equizone $E_B$, with no character, has the least number of
characters.
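The 4-bit hexad set, the set $T_4$ and the zone/decinumer split of an EBCDIC
byte can be generated directly. The sketch below is illustrative only: the
function names are hypothetical, and it assumes Python's `cp037` codec (one
particular EBCDIC code page) as the source of byte values, which may differ
in detail from the EBCDIC variant on which Table 1 is based.

```python
def hexad_set(bits: int = 4):
    """All binary words of the given width, in increasing order."""
    return [format(value, f"0{bits}b") for value in range(2 ** bits)]


def zone_and_decinumer(byte_value: int):
    """Split an 8-bit code into (4-bit zoned portion, decinumer)."""
    word = format(byte_value, "08b")
    return word[:4], int(word[4:], 2)


H = hexad_set(4)                 # zoned set: 0000, 0001, ..., 1111
T4 = [int(z, 2) for z in H]      # decimal values 0, 1, ..., 15

# Assumption: cp037 is used as a representative EBCDIC code page.
byte_a = "A".encode("cp037")[0]      # 0xC1
byte_zero = "0".encode("cp037")[0]   # 0xF0

print(H[:4], H[-2:])
print(zone_and_decinumer(byte_a))
print(zone_and_decinumer(byte_zero))
```

Under this code page the letter A falls in the equizone with zoned portion
1100 (that is, $E_C$) with decinumer 1, and the digit 0 falls in the equizone
with zoned portion 1111 ($E_F$) with decinumer 0.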

5. Discussion and Conclusion. 

This paper has related the idea behind group presentation to coded character
sets (CCSs) in digital computer architecture. Theorems which are analogues
of some of the well-known results in combinatorial group theory are then
presented. Just like group presentation, code presentation enables CCSs to
be written in a simple form and also assists in deriving information about
their structure. The results in the paper arise from the fact that group
presentation theory (or combinatorial group theory) is generally unsuitable
for presenting CCSs since not all CCSs are groups. Apart from this, the bits
of CCSs have some particular pattern of representation in classical computer
hardware [23, 25].
TABLE 1: THE DECINUMER SET OF THE EBCDIC
LIST OF SYMBOLS

SYMBOL          MEANING
$\in$           is an element of
$\forall$       for all
$\leq$          less than or equal to
$\bigcup$       union of some sets
$H_{16}^{n}$    n-bit hexad set
{}              set of elements
A - B           the difference of two sets A and B (set difference)
$w_1 - w_2$     word difference
$T_4$           order-preserving equidext of $H_{16}^{4}$
$\emptyset$     empty set

